Theory for the Emergence of Modularity in Complex Systems 

Jeong-Man Park and Michael W. Deem 
Department of Physics & Astronomy 
Rice University, Houston, TX 77005-1892, USA 
Department of Physics, The Catholic University of Korea, Bucheon 420-743, Korea 

Abstract 

Biological systems are modular, and this modularity evolves over time and in different
environments. A number of observations have been made of increased modularity in biological
systems under increased environmental pressure. We here develop a theory for the dynamics of
modularity in these systems. We find a principle of least action for the evolved modularity at
long times. In addition, we find a fluctuation dissipation relation for the rate of change of
modularity at short times.

PACS numbers: 87.10.-e, 87.15.A-, 87.23.Kg 






Biological systems have long been recognized to be modular. In 1942 Waddington presented
his now classic description of a canalized landscape for development, in which minor
perturbations do not disrupt the function of developmental modules [1]. In 1961 H. A.
Simon described how biological systems are more efficiently evolved and are more stable if
they are modular [2]. A seminal paper by Hartwell et al. firmly established the concept of
modularity in cell biology [3]. Systems biology has since provided a wealth of examples of
modular cellular circuits, including metabolic circuits [4] and modules on different scales,
i.e. modules of modules [5]. Protein-protein interaction networks have been observed to be
modular [6-8]. Ecological food webs have been found to be modular [9]. The gene regulatory
network of the developmental pathway exhibits modules [10, 11], and the developmental
pathway is modular [12]. Modules have even been found in physiology, specifically in spatial
correlations of brain activity [13, 14].

The modularity of a complex biological system can not only be quantified, but can also
change over time. There are a number of demonstrations of the evolution of modularity in
biological systems. For example, the modularity of the protein-protein interaction network
significantly increases when yeast are exposed to heat shock [15], and the modularity of the
protein-protein networks in both yeast and E. coli appears to have increased over
evolutionary time [16]. Additionally, food webs in low-energy, stressful environments are more
modular than those in plentiful environments [17], arid ecologies are more modular during
droughts [18], and foraging of sea otters is more modular when food is limiting [19]. The
modularity of social networks changes over time: stock brokers' instant messaging networks
are more modular under stressful market conditions [20], socio-economic community overlap
decreases with increasing stress, and criminal networks are more modular under increased
police pressure [21]. Modularity of financial networks changes over time: the modularity of
the world trade network has decreased over the last 40 years, leading to increased
susceptibility to recessionary shocks [22], and increased modularity has been suggested as a
way to increase the robustness and adaptability of the banking system [23].

In an effort to explain some of these observations, we here present a theory to describe
the dynamics of modularity. This analytical theory complements numerical models that have
investigated the dynamics of modularity [24-27]. We assume that modularity can be
quantified in the system under study. We further consider that modularity is a good order
parameter to describe the state of the system. That is, we project the dynamics onto the
slow mode of modularity, $M$. We then consider the equations of motion for the modularity
of the system. In particular, we consider an ensemble of systems, each with different values
of the modularity, and each evolving. The evolutionary dynamics of this system is fully
specified by the rate at which systems reproduce, $f$, termed "fitness," and the rate at which
changes of modularity arise, $\mu$. Since the state of the system is specified by the slow
variable $M$, the fitness is a function of the modularity, $f = f(M)$. The function $f(M)$ is
obtained from a detailed calculation, numerical simulation, or experimental observation.

We further consider that there is a pressure on this ensemble of systems to have an
efficient response function. A canonical form of this pressure is a changing environment.
As the environment changes, the favorable niches for the system change, and the system
must adapt to the changing landscape. The more rapidly the environment changes or the
more dramatically the environment changes, the more pressure there is on the system to be
adaptable. As noted above, it has been widely observed that systems under pressure tend
to become more modular. If we denote the rate of change of environment as $1/T$ and the
magnitude of the change as $p$, the mean fitness of the population of systems will depend on
these parameters, as well as the modularity: $f = f_{p,T}(M)$. Evolution of modularity depends
on how the response function of the system varies with these parameters. Since systems
under stress tend to become more modular, it is reasonable to assume that the population
average fitness for a modular system is greater than that for a non-modular system, at
least for small $T$ or large $p$ where stress is large. This behavior has been observed in a
model system evolving in a changing environment, when horizontal gene transfer is included
[24]. We note that this canonical behavior has also been observed in energy relaxation
dynamics of spin glass models of different sizes [28]. Glassy evolutionary dynamics has been
noted a number of times [29, 30]. Conversely, at long times, the non-modular system should
have a higher fitness, because modularity is a constraint on the optima that can be achieved.
This is the reason for the crossing of the solid and dashed curves in Fig. 1. We here take this
function $f = f_{p,T}(M)$ as input. We assume only that this function for large $M$ and small $M$
looks like the dashed and solid curves in Fig. 1. Putting these points together, we expect the
emergence of modularity at large $p$ or small $T$, where the stress is large. We seek a theory
to quantitatively describe this emergence. For very slow rates of environmental change, the
system can relax to the nearly optimal configuration, which is unlikely to be modular, as
modularity is a restriction on the system. Thus, we expect the population average fitness for
the non-modular system to be greater than that for a modular system for large $T$ or small $p$.

To proceed further, we define the "connection matrix" for our system. The connection
matrix gives the links between the nodes of the network. For example, in the protein-protein
interaction network, the nodes are the proteins and the links tell one whether protein $i$
interacts with protein $j$. The connection matrix $A_{ij}$ is a binary matrix which denotes
whether nodes $i$ and $j$ interact ($A_{ij} = 1$) or not ($A_{ij} = 0$). In the criminal network,
the connection matrix denotes whether criminal $i$ interacts with criminal $j$. The fitness
function that underlies the detailed dynamics which define $f_{p,T}(M)$ may well have non-trivial
couplings between nodes [24], and the connection matrix is the projection of the non-zero
couplings. We assume that each node is connected to $C$ other nodes on average. The number of
nodes is denoted by $N$. Rearrangement of the ones within this matrix changes the modularity of
the matrix. For simplicity, we assume that the modules which form are of size $L$. Thus
a modular system will have an excess of connections along the $L \times L$ block diagonals of
the connection matrix. In other words, the probability of a connection is $C_0/N$ outside the
block diagonals, when $\lfloor i/L \rfloor \ne \lfloor j/L \rfloor$, and $C_1/N$ inside the block
diagonals, when $\lfloor i/L \rfloor = \lfloor j/L \rfloor$, with $C = C_0 + (C_1 - C_0)L/N$.
Modularity is defined by the excess of connections in the block diagonals, over that observed
outside the block diagonals: $M = (C_1 - C_0)L/(NC)$.
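As an illustration of this definition, the following minimal Python sketch samples a binary connection matrix with a prescribed block structure and then measures its modularity. It assumes that matrix entries are drawn independently, that no symmetry is imposed on the matrix, and that L divides N; none of these choices is specified in the text, and the function names are hypothetical.

```python
import random


def block_probabilities(N, L, C, M):
    """Invert C = C0 + (C1 - C0) L / N and M = (C1 - C0) L / (N C) for (C0, C1)."""
    C0 = C * (1.0 - M)
    C1 = C * (1.0 + M * (N - L) / L)
    if not (0.0 <= C0 <= N and 0.0 <= C1 <= N):
        raise ValueError("C0/N and C1/N must both be valid probabilities")
    return C0, C1


def sample_matrix(N, L, C, M, seed=0):
    """Draw each entry independently (an assumption): C1/N inside blocks, C0/N outside."""
    rng = random.Random(seed)
    C0, C1 = block_probabilities(N, L, C, M)
    return [[1 if rng.random() < (C1 if i // L == j // L else C0) / N else 0
             for j in range(N)]
            for i in range(N)]


def modularity(A, L):
    """Measure M = (C1 - C0) L / (N C) from a binary matrix with block size L."""
    N = len(A)
    inside = sum(A[i][j] for i in range(N) for j in range(N) if i // L == j // L)
    total = sum(map(sum, A))
    C = total / N                      # mean number of connections per node
    C1 = inside / L                    # in-block density, inside / (N L), times N
    C0 = (total - inside) / (N - L)    # off-block density times N
    return (C1 - C0) * L / (N * C)
```

For example, `modularity(sample_matrix(120, 10, 12, 0.5), 10)` should return a value close to 0.5, up to sampling noise. With N = 120, L = 10, and C = 12, a target of M = 1 is rejected by `block_probabilities`, because it would require an in-block connection probability above one.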

If our population of systems is large, i.e. we have a large biological population size, the
probability distribution to have a matrix with modularity $m$ obeys

$$
\begin{aligned}
\frac{dP_m(t')}{dt'} ={}& \left[ f_{p,T}(m) - \langle f \rangle \right] P_m(t')
 + \mu C \frac{L}{N} \left[ (1-m)\left(1 - \frac{L}{N}\right) + \frac{1}{N} \right] P_{m-1/(N-L)}(t') \\
& + \mu C \left(1 - \frac{L}{N}\right) \left[ m + (1-m)\frac{L}{N} + \frac{1}{N} \right] P_{m+1/(N-L)}(t') \\
& - \mu C \left(1 - \frac{L}{N}\right) \left[ m + 2(1-m)\frac{L}{N} \right] P_m(t')
\end{aligned}
\tag{1}
$$

where $m$ takes values $-L/(N-L), (-L+1)/(N-L), (-L+2)/(N-L), \ldots, 1$. The average
fitness is given by $\langle f(t') \rangle = \sum_m f_{p,T}(m) P_m(t')$. The average modularity as
a function of time is given by $M(t') = \sum_m m P_m(t')$. Multiplying Eq. (1) by $m$ and
summing, we find that the rate of change of modularity satisfies



$$
M' = \langle m f(m) \rangle - M \langle f \rangle - \mu C M / N
\tag{2}
$$

Here $M$ is the average modularity of the population, and $m$ is the modularity for any
particular matrix in the population, i.e. $M = \langle m \rangle$. Biologists would term this
equation a continuous-time Price equation [31], and we will show below that this equation
implies a type of useful fluctuation-dissipation theorem.
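The step from Eq. (1) to Eq. (2) can be checked numerically. The sketch below assumes one particular reading of the mutation terms in Eq. (1): a single connection moves per event, an outside connection moves inside at rate $\mu C (1-m)(L/N)(1-L/N)$, an inside connection moves outside at rate $\mu C [m + (1-m)L/N](1-L/N)$, and each move changes $m$ by $1/(N-L)$. The function names and the test fitness function are hypothetical.

```python
def grid(N, L):
    """Allowed modularity values -L/(N-L), ..., 1 in steps of 1/(N-L)."""
    return [k / (N - L) for k in range(-L, N - L + 1)]


def rates(m, N, L, C, mu):
    """Assumed mutation rates out of state m: (m -> m + step, m -> m - step)."""
    up = mu * C * (1 - m) * (L / N) * (1 - L / N)
    down = mu * C * (m + (1 - m) * L / N) * (1 - L / N)
    return up, down


def master_rhs(P, ms, fitness, N, L, C, mu):
    """Right-hand side dP_m/dt' of the master equation for every state m."""
    f = [fitness(m) for m in ms]
    f_avg = sum(fi * pi for fi, pi in zip(f, P))
    dP = [(fi - f_avg) * pi for fi, pi in zip(f, P)]   # selection term
    for k, (m, p) in enumerate(zip(ms, P)):
        up, down = rates(m, N, L, C, mu)
        dP[k] -= (up + down) * p                       # probability leaving m
        if k + 1 < len(ms):
            dP[k + 1] += up * p                        # arrives at m + 1/(N-L)
        if k > 0:
            dP[k - 1] += down * p                      # arrives at m - 1/(N-L)
    return dP


def price_rhs(P, ms, fitness, N, C, mu):
    """Right-hand side of Eq. (2): <m f> - M <f> - mu C M / N."""
    M = sum(m * p for m, p in zip(ms, P))
    f_avg = sum(fitness(m) * p for m, p in zip(ms, P))
    mf_avg = sum(m * fitness(m) * p for m, p in zip(ms, P))
    return mf_avg - M * f_avg - mu * C * M / N
```

For any normalized distribution `P` on `grid(N, L)`, the first moment `sum(m * d for m, d in zip(ms, master_rhs(...)))` should agree with `price_rhs(...)` to rounding error, and `sum(master_rhs(...))` should vanish, since both selection and mutation conserve total probability. The rates vanish at the two ends of the grid, so no probability leaves the allowed range.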

For large values of $N$, for which the changes in $M$ are nearly continuous, the average
fitness implied by Eq. (1) at long times may be determined by techniques borrowed from
quantum field theory [32, 33]. The average modularity follows a dynamical trajectory away
from an initial state to a final steady state value. The remarkable result from this derivation
is that the modularity which emerges at long times obeys

$$
f_{\rm pop} = \max_{\xi} \left\{ f_{p,T}(\xi) - \mu C \frac{(N-L)L}{N^2}
\left[ 2 + \left(\frac{N}{L} - 2\right)\xi
 - 2\sqrt{(1-\xi)\left[1 + \left(\frac{N}{L} - 1\right)\xi\right]} \right] \right\}
\tag{3}
$$

with modularity determined by the solution of the implicit equation

$$
f_{p,T}(M) = f_{\rm pop}
\tag{4}
$$

Here $f_{\rm pop}$ is the mean population fitness divided by $N$. Thus, a principle of least action
gives the evolved modularity.

While Eq. (3) is a general result, we can proceed further in the limit that evolved
modularities are small. Expanding for small $M$, we find

$$
\begin{aligned}
\xi &= \frac{2L \left[ df_{p,T}/dM |_{M=0} \right]}
            {\mu C (N-L) - 2L \left[ d^2 f_{p,T}/dM^2 |_{M=0} \right]} \\
f_{\rm pop} &= f_{p,T}(0) + \frac{L \left[ df_{p,T}/dM |_{M=0} \right]^2}
            {\mu C (N-L) - 2L \left[ d^2 f_{p,T}/dM^2 |_{M=0} \right]} \\
M &= \frac{L \left[ df_{p,T}/dM |_{M=0} \right]}
          {\mu C (N-L) - 2L \left[ d^2 f_{p,T}/dM^2 |_{M=0} \right]}
\end{aligned}
\tag{5}
$$

Thus, as long as a modular system has a higher fitness, modularity will spontaneously emerge 
for large enough system sizes, N. 
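A numerical sketch can compare the variational result, Eq. (3) together with Eq. (4), against the small-$M$ limit. It takes the small-$M$ result of Eq. (5) to read $M = L f'(0) / [\mu C (N-L) - 2 L f''(0)]$, and it uses a hypothetical linear fitness function with a small slope; both the test function and the helper names are assumptions made for illustration. The maximization uses a simple ternary search, which presumes a single maximum on the allowed range of $\xi$.

```python
import math


def penalty(xi, N, L, C, mu):
    """Mutational term subtracted from the fitness inside the max of Eq. (3)."""
    r = N / L
    root = math.sqrt(max(0.0, (1 - xi) * (1 + (r - 1) * xi)))
    return mu * C * (N - L) * L / N**2 * (2 + (r - 2) * xi - 2 * root)


def population_fitness(f, N, L, C, mu, iters=200):
    """Return (f_pop, xi*) by ternary search on [-L/(N-L), 1]."""
    h = lambda x: f(x) - penalty(x, N, L, C, mu)
    lo, hi = -L / (N - L), 1.0
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if h(a) < h(b):
            lo = a
        else:
            hi = b
    xi = 0.5 * (lo + hi)
    return h(xi), xi


def evolved_modularity(f, f_pop, xi_star, iters=200):
    """Solve f(M) = f_pop by bisection on [0, xi*], assuming f increases there."""
    lo, hi = 0.0, xi_star
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < f_pop:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def small_m_modularity(df, d2f, N, L, C, mu):
    """Small-M estimate of the evolved modularity from the derivatives of f at M = 0."""
    return L * df / (mu * C * (N - L) - 2 * L * d2f)
```

With N = 120, L = 10, C = 12, $\mu$ = 0.01 and the hypothetical fitness `f = lambda x: 1.0 + 0.002 * x`, the two estimates of $M$ should agree to within a few percent. The agreement degrades once $\xi N / L$ is no longer small, because the quadratic expansion of the square root in Eq. (3) then loses accuracy.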

We now derive a relationship between the rate of growth of modularity and the
environmental pressure. Let us say that the fitness for small values of modularity can be
expressed as $f(m)/N = f_0 + m \Delta f$. Equation (2) becomes $M' = \sigma_M^2 \Delta f - \mu C M/N$,
where $\sigma_M^2 = \langle m^2 \rangle - M^2$. For small $L/N$, this equation combined with Eq. (5)
implies that at steady state $\sigma_M^2 = L/N^2$. Let us investigate the growth of modularity
from an initially non-modular state. The value of $\Delta f$ depends on $p$. If $p = 0$, the
environment is not changing, and the system will stay in the $M = 0$ state with $\Delta f = 0$.
If $p = 1$ then $\Delta f = f_1 - f_0 \approx f_1$ because only the modular system can evolve
significantly during the time $T$ on the completely randomized, new landscape. Making a linear
interpolation, we find $\Delta f \approx p f_1(t_c) = p f_0(t_c)$,




FIG. 1: Shown is the fitness of an evolving system (horizontal axis: time). The fitness of the
non-modular ($f_0$, solid), modular ($f_M$, dot-dashed), and block-diagonal ($f_1$, dashed)
systems is shown. The modularity calculated from Eq. (3) is shown (dotted). Also shown is the
result for small $M$, Eq. (5), to first order in $L/N$ (short dashed). In this example $N = 120$,
$L = 10$, $\mu = 0.01$, and $C = 12$. In this case $t_c \approx 285$.

where $t_c$ is the time at which the non-modular and fully modular curves cross. We thus find
$M' \approx \sigma_M^2 p f_0(t_c)$. Reverting back to real time, i.e. $t = T t'$ because there is
"fast" dynamics that occurs between each environmental change of duration $T$, we find

$$
p_E = \frac{1}{R} \frac{dM}{dt}
\tag{6}
$$

where $p_E = p/T$ is the environmental pressure, and $R = \sigma_M^2 f_0(t_c)$.

Equation (6) follows from the principle of least action, Eq. (3), the dynamic generalization
of it, Eq. (2), and the response function of the modular system being greater than that
of the non-modular system at short times. Equation (6) may be interpreted as a Taylor
series expansion of $dM/dt$ in allowed combinations of $p$ and $1/T$. Alternatively, Eq. (6)
may be interpreted as the linear response of the modularity to the environmental pressure.
The coefficient $R$ is a measure of ruggedness, since $R$ is proportional to the variance of
the modularity, which is expected to be related to the ruggedness of the landscape. The
coefficient $R$ is also expected to be related to replicate variability in experiments [34].
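The linear-response relation of Eq. (6) can be written as a short calculation. The sketch below assumes that the steady-state estimate $\sigma_M^2 = L/N^2$ may be used for the variance, and it treats the crossing-point fitness $f_0(t_c)$ as a user-supplied number; the value 0.5 used in the usage note is purely hypothetical, and the function names are illustrative.

```python
def ruggedness(L, N, f0_tc):
    """R = sigma_M^2 * f_0(t_c), with the assumed estimate sigma_M^2 = L / N^2."""
    return (L / N**2) * f0_tc


def modularity_growth_rate(p, T, R):
    """dM/dt = R * p_E with environmental pressure p_E = p / T (Eq. 6 rearranged)."""
    return R * (p / T)


def inferred_pressure(dM_dt, R):
    """Eq. 6 read directly: p_E = (1 / R) dM/dt."""
    return dM_dt / R
```

For instance, `R = ruggedness(10, 120, 0.5)` followed by `modularity_growth_rate(0.4, 400.0, R)` gives the initial growth rate of modularity for a change of magnitude 0.4 every 400 time units. Doubling `p` or halving `T` doubles the rate, which is the linear response described in the text.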

What does this theory mean? Equation (6) says that an increase of environmental pressure
should lead to the evolution of systems with increased modularity. A study of 117
species of bacteria showed that the modularity of the bacteria's metabolic networks
increased monotonically with variability of the environment in which the bacteria lived [35].
Metabolic networks of pathogens alternating between hosts were found to be more modular
than those of single-host pathogens [36]. A number of other examples were mentioned in
the introduction.

The present theory should allow the analysis of complex, evolving systems to go beyond 
a demonstration of the existence of modularity to a quantitative analysis of the dynamics 
of modularity. 



[1] C. H. Waddington, Nature 150, 563 (1942). 

[2] H. A. Simon, Proc. Amer. Phil. Soc. 106, 467 (1962). 

[3] L. H. Hartwell, J. J. Hopfield, S. Leibler, and A. W. Murray, Nature 402, C47 (1999). 
[4] E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, and A.-L. Barabasi, Science 297, 1551
(2002), URL http://www.sciencemag.org/cgi/content/abstract/297/5586/1551.

[5] M. R. da Silva, H. Ma, and A.-P. Zeng, Pr. Inst. Electr. Elect. 96, 1411 (2008), URL
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4567408.

[6] V. Spirin and L. A. Mirny, Proc. Natl. Acad. Sci. USA 100, 12123 (2003), URL
http://www.pnas.org/cgi/content/abstract/100/21/12123.

[7] A.-C. Gavin, P. Aloy, P. Grandi, R. Krause, M. Boesche, M. Marzioch, C. Rau,
L. J. Jensen, S. Bastuck, B. Dümpelfeld, et al., Nature 440, 631 (2006), URL
http://www.ncbi.nlm.nih.gov/pubmed/16429126.

[8] C. von Mering, E. M. Zdobnov, S. Tsoka, F. D. Ciccarelli, J. B. Pereira-Leal,
C. A. Ouzounis, and P. Bork, Proc. Natl. Acad. Sci. USA 100, 15428 (2003), URL
http://www.pnas.org/cgi/content/abstract/100/26/15428.

[9] A. E. Krause, K. A. Frank, D. M. Mason, R. E. Ulanowicz, and W. W. Taylor, Nature 426,
282 (2003), URL http://www.ncbi.nlm.nih.gov/pubmed/14628050.

[10] E. C. Raff and R. A. Raff, Evol. Dev. 2, 235 (2000), URL
http://www3.interscience.wiley.com/journal/119052048/abstract.

[11] G. P. Wagner, Integr. Comp. Biol. 36, 36 (1996), URL
http://icb.oxfordjournals.org/cgi/content/abstract/36/1/36.

[12] C. P. Klingenberg, Annu. Rev. Ecol. Evol. S. 39, 115 (2008), URL
http://search.ebscohost.com/login.aspx?direct=true&db=a9h&AN=35967311.

[13] D. Meunier, S. Achard, A. Morcom, and E. Bullmore, NeuroImage 44, 715 (2009), URL
http://www.ncbi.nlm.nih.gov/pubmed/19027073.

[14] M. Chavez, M. Valencia, V. Navarro, V. Latora, and J. Martinerie, Phys. Rev. Lett. 104,
118701 (2010), URL http://prl.aps.org/abstract/PRL/v104/i11/e118701.

[15] A. Mihalik and P. Csermely, PLoS Comput. Biol. 7, e1002187 (2011).

[16] J. He, J. Sun, and M. W. Deem, Phys. Rev. E 79, 031907 (2009). 

[17] D. M. Lorenz, A. Jeng, and M. W. Deem, Phys. Life Rev. 8, 129 (2011). 

[18] M. Rietkerk, S. C. Dekker, P. C. de Ruiter, and J. van de Koppel, Science 305, 1926 (2004). 

[19] M. T. Tinker, G. Bentall, and J. A. Estes, Proc. Natl. Acad. Sci. USA 105, 560 (2008). 

[20] S. Saavedra, K. Hagerty, and B. Uzzi, Proc. Natl. Acad. Sci. USA 108, 5296 (2011). 

[21] M. Kenney, in Networked politics: Agency, power, and governance, edited by M. Kahler
(Cornell University Press, 2009), pp. 79-102.

[22] J. He and M. W. Deem, Phys. Rev. Lett. 105, 198701 (2010), URL
http://prl.aps.org/abstract/PRL/v105/i19/e198701.



[23] A. G. Haldane and R. M. May, Nature 469, 351 (2011). 

[24] J. Sun and M. W. Deem, Phys. Rev. Lett. 99, 228107 (2007). 

[25] H. Lipson, J. B. Pollack, and N. P. Suh, Evolution 56, 1549 (2002). 

[26] N. Kashtan and U. Alon, Proc. Natl. Acad. Sci. USA 102, 13773 (2005). 

[27] N. Kashtan, E. Noor, and U. Alon, Proc. Natl. Acad. Sci. USA 104, 13711 (2007). 

[28] W. Kinzel, Phys. Rev. B 33, 5086 (1986). 

[29] B. S. Khatri, T. C. McLeish, and R. P. Sear, Proc. Natl. Acad. Sci. USA 106, 9564 (2009).

[30] K. Vetsigian, C. Woese, and N. Goldenfeld, Proc. Natl. Acad. Sci. USA 103, 10696 (2006). 

[31] G. R. Price, Nature 227, 520 (1970). 

[32] L. Peliti, Europhys. Lett. 57, 745 (2002). 

[33] J.-M. Park and M. W. Deem, J. Stat. Phys. 125, 975 (2006). 

[34] T. F. Cooper and R. E. Lenski, BMC Evol. Biol. 10, 11 (2010).

[35] M. Parter, N. Kashtan, and U. Alon, BMC Evol. Biol. 7, 169 (2007). 

[36] A. Kreimer, E. Borenstein, U. Gophna, and E. Ruppin, Proc. Natl. Acad. Sci. USA 105, 6976 
(2008). 






